posted 12-15-2007 03:13 PM
Thanks Barry, I'm taking a break from icy roads and sub-zero temps.
Does anyone know if the Patrick and Iacono 1991 procedures are consistent with RCMP procedures?
Sorry - Patrick and Iacono 1991 is a different study than the one GM seems to be referring to. It was interesting though, and still provides some retort.
--------------
The authors did not seem to specify what proportion of the exams were DIR (direct - "did you...") and IND (indirect - "do you know...", "did you plan...", "have you personally received...")
GM criticizes the 50% specificity rate (which was actually 55%).
This is important because the Blackwell data also showed poor specificity (and good sensitivity).
Keep in mind that this is as much about procedures as it is about construct validity.
I don't know about RCMP scoring features/criteria. But the DoDPI criteria in use during the 1990s were basically "score anything that moves" criteria - regardless of Kircher's discriminant analysis in 1988. Present criteria have been improved.
The advantage of long (score anything) criteria lists is that they increase the sensitivity of the test to deception, because a broad range of possible reactions attracts the examiner's concern. (Any of the criteria will move scores in the DI direction.) The disadvantage of long and poorly researched criteria/feature lists is a reduction in specificity due to the gauntlet effect - one has to show no reaction to anything to move scores in the NDI direction (see the sketch below).
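Here's a minimal sketch of the gauntlet effect, assuming (unrealistically) that criteria fire independently and that a truthful examinee has some small per-criterion chance of producing a scoreable reaction. The 10% figure is a made-up illustration, not an empirical estimate.

def p_clean_run(p_react, k_criteria):
    # Probability a truthful examinee shows no scoreable reaction on
    # any of k criteria, assuming independent criteria - the longer
    # the list, the harder the gauntlet.
    return (1.0 - p_react) ** k_criteria

for k in (3, 6, 12, 24):
    print("%2d criteria -> P(no reactions) = %.2f" % (k, p_clean_run(0.10, k)))

With a 10% per-criterion reaction rate, a truthful subject survives 3 criteria about 73% of the time, but 24 criteria only about 8% of the time - specificity collapses as the list grows.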
So... Duh. They found poor specificity - so did Blackwell 1999 - probably for the same reason.
Generalizing those early findings to modern situations is unsound, and the studies need to be replicated AGAIN.
Another contributor to the specificity deficiencies observed by P&I 1991 is that the IND tests were evaluated using spot rules while the DIR exams were evaluated using total scores. IND exams were multi-facet. DIR were single-issue.
We know now, thanks to Krapohl and others, that the spot rule can cause increased sensitivity at the cost of decreased specificity. We also know now, from Senter and Dollins 2003, that two-stage rules correct for that, and permit increased sensitivity without an excess risk of decreased specificity. This is the exact same concern, as discussed a few days ago, involving the addition-rule and inflated alpha contributing to increased FPs. Senter's two-stage rules are procedurally analogous to the mathematical implementation of a Bonferroni correction to the desired alpha. (I'm quite sure he knows that - he just didn't burden us with statistical concepts and simply told us the solution.) A numerical sketch of the inflation and the correction follows.
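To make the inflated-alpha point concrete, here's a minimal sketch assuming independent spots and a hypothetical per-spot alpha of .05 - illustrative numbers only, not anyone's published values.

def familywise_alpha(per_spot_alpha, m_spots):
    # Chance of at least one false spot "hit" across m spots,
    # assuming the spots are independent.
    return 1.0 - (1.0 - per_spot_alpha) ** m_spots

alpha, m = 0.05, 3
print("naive per-spot alpha %.3f -> familywise %.3f" % (alpha, familywise_alpha(alpha, m)))
print("Bonferroni alpha %.4f -> familywise %.3f" % (alpha / m, familywise_alpha(alpha / m, m)))

Testing three spots naively at .05 each inflates the familywise false-positive rate to about .143; dividing alpha by the number of spots holds it near the nominal .05 - which is the statistical version of what the two-stage rule accomplishes procedurally.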
P&I used -2 and +1 for spot cutscores, and +/-6 for totals. We don't really know the p-values for those spot or total scores. However, we do know from Senter (2005?) that MGQT/spot scoring rules provide sensitivity in the ballpark reported by Patrick and Iacono - and they appear to have scored the IND exams using MGQT/spot rules. The DIR exams were presumably scored with total-score rules. A sketch of how the two rules can disagree on the same scores is below.
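For illustration, a hedged sketch of the two decision rules using those cutscores; the DI/NDI/INC semantics here are my reading of typical spot and total-score rules, not a quote of P&I's protocol.

def spot_rule(spot_scores):
    # DI if any spot reaches the -2 cutscore; NDI only if every spot
    # reaches +1; otherwise inconclusive. The gauntlet in miniature.
    if any(s <= -2 for s in spot_scores):
        return "DI"
    if all(s >= 1 for s in spot_scores):
        return "NDI"
    return "INC"

def total_rule(spot_scores):
    # DI or NDI only when the grand total clears the +/-6 cutscore.
    total = sum(spot_scores)
    if total <= -6:
        return "DI"
    if total >= 6:
        return "NDI"
    return "INC"

scores = [1, -2, 4]  # hypothetical spot totals for a three-question exam
print(spot_rule(scores))   # DI - one spot fails, even though the total is +3
print(total_rule(scores))  # INC - the total clears neither cutscore

Same data, different call - which is exactly why the IND/spot-scored exams would show more sensitivity and less specificity than the DIR/total-scored exams.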
GM neglected to point out that the original examiners substantially outperformed the blind reviewers - especially with the truthful cases - consistent with other findings. The chi-square analysis showed hit rates significantly above chance for both deceptive and truthful subjects (p < .01 for both), though they pointed out a difference between decisions and scores for the original scorers - with the decisions for truthful subjects, based on scores, not significantly above chance (interesting).
The findings are interesting compared with those of Blackwell 1999:

           Deceptive                    Truthful
ZCT        ~98% correct, ~2% error,     ~60% correct, ~30% error,
           ~6% INC                      ~26% INC
MGQT       ~96% correct, 0% error,      ~25% correct, ~50% error,
           ~3% INC                      ~22% INC

Fleiss' kappa was .57 for both.
We know that the "bigger is better" rule was not taught until the late 1990s, and the Blackwell cases appear to have been scored during 1996.
Read Krapohl and Norris 2000 for more info on how the spot scoring rule affects observed outcomes.
We have no description of the procedures of the original scorers for the P & I cases, and they may have differed from those of the blind scorers.
P&I investigated only crime type and age as mitigating variables for false-positive errors, and made no effort to evaluate decision rules or procedures (as Homer Simpson would say: "D'oh!").
They should have investigated procedural and decision rule bias.
Krapohl, Shull, and Ryan investigated the confession criterion and found no significant differences between the scores of deceptive persons who confessed and those who did not confess - suggesting no great differences in how the test works with deceptive confessors compared to deceptive non-confessors. I'm no expert in statistical power analysis, but there is nothing obviously deficient about the study; a sketch of the kind of power check one could run is below.
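For the curious, here's a minimal sketch of that kind of power check, using statsmodels; the effect size and group sizes are hypothetical illustrations, not the study's actual values.

from statsmodels.stats.power import TTestIndPower

# Hypothetical power check for a confessor vs. non-confessor score
# comparison; d = 0.5 and n = 50 per group are made-up numbers.
power = TTestIndPower().power(effect_size=0.5, nobs1=50, alpha=0.05, ratio=1.0)
print("Power to detect d = 0.5 with n = 50 per group: %.2f" % power)

Under those assumptions the power comes out around .70 - the sort of number you'd want to see before leaning too hard on a null result.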
In short, P & I can easily be understood as a critique of methodology - some of which has been or is being rectified.
What P & I ascribe to criterion bias may actually be attributable to a procedural and decision rule bias.
Results from those studies are difficult to compare to modern results, until someone takes the time to re-evaluate those samples using modern testing principles.
r
ASIDE: nopolyforme is beginning, at times, to remind me of digithead, though with less pretense around the statistical expertise he gleaned from NAS. I'm tempted to go back and re-read their past postings.
------------------
"Gentlemen, you can't fight in here. This is the war room."
--(Stanley Kubrick/Peter Sellers - Dr. Strangelove, 1964)